Clear Environment

Good practice to keep the environment clean and start fresh.

NOTE: Not included in knit

Load Packages

NOTE: Not included in knit
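
Because this chunk is hidden from the knit, here is a minimal sketch of what it might contain. The package list is inferred from the functions called later in this document (for example `plot_grid()` from cowplot, `stat_compare_means()` from ggpubr, and `st()` from vtable); it is an assumption, not the author's actual chunk.

```r
# Packages inferred from function calls in this document (an assumption,
# not the author's hidden chunk).
pkgs <- c("leaflet", "ggplot2", "dplyr", "cowplot", "ggpubr", "data.table",
          "viridis", "scales", "lubridate", "vtable", "png", "grid")

# Report anything that is not installed instead of failing midway through.
missing_pkgs <- pkgs[!vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)]
if (length(missing_pkgs) > 0) {
  message("Install before knitting: ", paste(missing_pkgs, collapse = ", "))
}

# Attach only the packages that are available.
invisible(lapply(setdiff(pkgs, missing_pkgs), library, character.only = TRUE))
```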

Load Data

NOTE: Run the data prep scripts before loading data. The Arbimon Data Prep Script must be run first to produce the Recordings.csv file.

NOTE: Not included in knit

Run for all
Arbimon
BirdNet
Kaleidoscope

Data Wrangling - Yeehaw!

NOTE: Not included in knit

Arbimon
Kaleidoscope

Site Map

Stations are interactive: click a point to see the station name. There are also several basemap options. Toggle between them for added fun.

pal=colorFactor(palette = "viridis", domain = NULL)

leaflet(recording_metadata) %>%
  addProviderTiles("OpenStreetMap", group = "OpenStreetMap") %>% 
  addProviderTiles("CartoDB.Positron", group = "CartoDB.Positron") %>% 
  addProviderTiles("Esri.WorldImagery", group = "Esri.WorldImagery") %>% 
  addProviderTiles("Esri.WorldTopoMap", group = "Esri.WorldTopoMap") %>% 
  addProviderTiles("NASAGIBS.ViirsEarthAtNight2012", group = "NASAGIBS.ViirsEarthAtNight2012") %>%  
  addLayersControl(baseGroups = c("Esri.WorldImagery","OpenStreetMap", "CartoDB.Positron", "Esri.WorldTopoMap", "NASAGIBS.ViirsEarthAtNight2012"), 
                   position = "topleft")%>%
  addCircleMarkers(~longitude, ~latitude,radius=5,
    fillColor = ~pal(name),
    stroke = TRUE, 
    color="white",
    weight=0.75,
    fillOpacity = 0.75,
    popup= ~name) %>% 
  addLegend("bottomleft", pal = pal, values = ~name, opacity = .98,
    title = "ARU Location") %>% 
  fitBounds(
    lng1 = min(recording_metadata$longitude), 
    lat1 = min(recording_metadata$latitude), 
    lng2 = max(recording_metadata$longitude), 
    lat2 = max(recording_metadata$latitude)
  )

Recording Schedule Graphics

Visualization of recording periods per year. This is based on the recording data extracted from Arbimon, but it applies to all of the analyzed data. It is the backbone information that puts all the indices and species IDs in context.
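
The schedule plots rely on `Julian_Day` and `Year` columns. As a sketch of how such columns can be derived from a timestamp in base R (the `datetime` column name and the sample rows are hypothetical; the data prep script may do this differently):

```r
# Hypothetical recording timestamps; the real column name may differ.
recordings <- data.frame(
  name     = c("GAR_01", "RF_01"),
  datetime = as.POSIXct(c("2024-04-26 05:00:00", "2024-09-10 21:30:00"), tz = "UTC")
)

# "%j" formats a date as day of year (001-366); "%Y" is the four-digit year.
recordings$Julian_Day <- as.integer(format(recordings$datetime, "%j"))
recordings$Year       <- format(recordings$datetime, "%Y")

recordings$Julian_Day  # 117 254
```

`Year` is kept as a character string here because the filters in this document compare it to quoted values such as "2024".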

# 2023 Recording Schedule
ggplot(recording_metadata %>% filter (Year == "2023", name != "TEST Bait Site (Throwaway)"), aes(x = Julian_Day, y = name)) +
  geom_segment(aes(xend = Julian_Day, yend = name), color = "gray") +  
  geom_point(aes(color = as.factor(name)), size = 4) +  
  scale_color_viridis_d() +  
  scale_x_continuous(breaks = seq(0, 365, by = 5)) + 
  labs(x = "Julian Day", y = "Recorder (Site ID)",
       title = "Julian Days Each Recorder Was Active in 2023") +
  theme_minimal(base_size = 15) +
  theme(panel.grid.major.y = element_blank())+
  theme(legend.position = "none")

# 2024 Recording Schedule
ggplot(recording_metadata %>% filter (Year == "2024"), aes(x = Julian_Day, y = name)) +
  geom_segment(aes(xend = Julian_Day, yend = name), color = "gray") +  
  geom_point(aes(color = as.factor(name)), size = 4) +  
  scale_color_viridis_d() +  
  scale_x_continuous(breaks = seq(0, 365, by = 5)) + 
  labs(x = "Julian Day", y = "Recorder (Site ID)",
       title = "Julian Days Each Recorder Was Active in 2024") +
  coord_cartesian(xlim = c(115, 255)) +
  theme_minimal(base_size = 15) +
  theme(panel.grid.major.y = element_blank())+  
  theme(legend.position = "none")

Graphics!!!

Arbimon

Presence/Absence

Come back here and add some information :)

# Load the image
sparrow_image <- png::readPNG("C:/Users/kfaller/Documents/GitHub/Acoustics/Images/SALS.png")  # Load your image
sparrow_grob <- rasterGrob(sparrow_image, interpolate = TRUE)

# SALS

  # 2024
  ggplot(species_IDs %>% filter(scientific_name == "Ammospiza caudacuta", Year == "2024"), 
         aes(x = Julian_Day, y = name, fill = factor(validated))) +
    geom_tile(color = "white") +  # White tile borders for a cleaner look
    scale_fill_manual(values = c("0" = "gray", "1" = "darkgreen"), name = "Presence") +  # Gray = absent (0), green = present (1)
    scale_x_continuous(breaks = seq(0, 365, by = 5)) +  # Tick mark every 5 Julian days
    labs(x = "Julian Day", y = "Site ID", title = "2024 Saltmarsh Sparrow Presence/Absence") +
    theme_bw(base_size = 15)+
    annotation_custom(sparrow_grob, xmin=160, xmax=175, ymin=0, ymax=3)

  # 2023 THIS DATA SEEMS OFF. I HAD 28 SALS LAST YEAR
  ggplot(species_IDs %>% filter(scientific_name == "Ammospiza caudacuta", Year == "2023"), 
         aes(x = Julian_Day, y = name, fill = factor(validated))) +
    geom_tile(color = "white") +  # White tile borders for a cleaner look
    scale_fill_manual(values = c("0" = "gray", "1" = "darkgreen"), name = "Presence") +  # Gray = absent (0), green = present (1)
    scale_x_continuous(breaks = seq(0, 365, by = 5)) +  # Tick mark every 5 Julian days
    labs(x = "Julian Day", y = "Site ID", title = "2023 Saltmarsh Sparrow Presence/Absence") +
    theme_bw(base_size = 15)+
    annotation_custom(sparrow_grob, xmin=160, xmax=175, ymin=0, ymax=3)

#NESP
  
 # 2023
  ggplot(species_IDs %>% filter(scientific_name == "Ammodramus nelsoni", Year == "2023"), 
         aes(x = Julian_Day, y = name, fill = factor(validated))) +
    geom_tile(color = "white") +  # White tile borders for a cleaner look
    scale_fill_manual(values = c("0" = "gray", "1" = "darkgreen"), name = "Presence") +  # Gray = absent (0), green = present (1)
    scale_x_continuous(breaks = seq(0, 365, by = 5)) +  # Tick mark every 5 Julian days
    labs(x = "Julian Day", y = "Site ID", title = "2023 Nelson's Sparrow Presence/Absence") +
    theme_bw(base_size = 15)+
    annotation_custom(sparrow_grob, xmin=155, xmax=170, ymin=0, ymax=3) 

Kaleidoscope Acoustic Index Graphics

We are using Wildlife Acoustics’ Kaleidoscope software to take our raw audio files and output numerical values for different aspects of the soundscape for each file. Those values were then averaged to reduce the noise of the data. The indices we are including are:

  • BI: Bioacoustic Index
  • ADI: Acoustic Diversity Index
  • AEI: Acoustic Evenness Index
  • ACI: Acoustic Complexity Index
  • NDSI: Normalized Difference Soundscape Index aka Anthropogenic Disturbance

The ACI (Pieretti et al. 2011) measures the variability within biotic sounds while ignoring anthropogenic noise. BI (Boelman et al. 2007) measures the number of frequency bands used by birds and is a proxy for avian abundance. NDSI (Kasten et al. 2012), which will be of considerable importance to our urban-to-rural gradient objectives, estimates the level of anthropogenic disturbance (i.e., noise pollution) in the soundscape by comparing human-generated sound with biological sound. Finally, ADI (Villanueva-Rivera et al. 2011) is derived from Shannon Diversity and calculates the diversity of sound. After our deployments, these indices will form a large dataset that tracks each site over time.
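
To make the NDSI scale concrete: the index is (biophony - anthrophony) / (biophony + anthrophony), so it runs from -1 (all anthropogenic) to +1 (all biological). A minimal sketch, with made-up band energies standing in for the values Kaleidoscope computes from each file:

```r
# NDSI from the acoustic energy in the biological and anthropogenic
# frequency bands. The inputs here are invented for illustration.
ndsi <- function(biophony, anthrophony) {
  (biophony - anthrophony) / (biophony + anthrophony)
}

ndsi(biophony = 3, anthrophony = 1)  # 0.5: mostly biological sound
ndsi(biophony = 0, anthrophony = 2)  # -1: entirely anthropogenic sound
ndsi(biophony = 2, anthrophony = 0)  # 1: entirely biological sound
```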

Averages for Indices

Accounting for two microphones (= 2 values for each time period) & creating daily average
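
A rough sketch of that two-step averaging, written in base R so it runs without extra packages. The column names (`STATION`, `DAY`, `TIME`, `MIC`, `ACI`) and the values are placeholders, not necessarily those in the Kaleidoscope output:

```r
# Toy data: two microphones per recording time, two recording times per day.
index_raw <- data.frame(
  STATION = "GAR_01",
  DAY     = as.Date("2023-04-26"),
  TIME    = rep(c("05:00", "06:00"), each = 2),
  MIC     = rep(c("left", "right"), times = 2),
  ACI     = c(150, 154, 160, 168)
)

# Step 1: average the two microphone values within each recording time.
per_time <- aggregate(ACI ~ STATION + DAY + TIME, data = index_raw, FUN = mean)

# Step 2: average those per-time values within each station and day.
day_avg <- aggregate(ACI ~ STATION + DAY, data = per_time, FUN = mean)
names(day_avg)[names(day_avg) == "ACI"] <- "avg_ACI"

day_avg$avg_ACI  # 158
```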

Daily average acoustic complexity across sites. Higher numbers represent a more complex soundscape.

Daily average acoustic evenness across sites. Higher numbers represent a more even soundscape. A healthy ecosystem should have low evenness. High evenness scores could reflect high rates of anthropogenic disturbance or low biodiversity. The y axis is on a natural log scale.

Daily average acoustic diversity across sites. Higher numbers represent a more diverse soundscape. A healthy ecosystem should have high diversity, and high diversity scores could reflect a more biodiverse community. The y axis is on a natural log scale, with 2.3 being the highest diversity possible. This index is the closest to Shannon Diversity. It includes all recorded sounds (including any anthropogenic disturbance), giving a value for the entire soundscape rather than just the biological diversity.
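
The 2.3 ceiling follows from the Shannon formula: when sound is spread perfectly evenly over n frequency bands, the index equals ln(n), and ln(10) is about 2.30. The sketch below assumes 10 bands, which is what the stated maximum implies; the band proportions are invented.

```r
# Shannon diversity over the proportion of sound in each frequency band.
shannon <- function(p) {
  p <- p[p > 0] / sum(p)   # drop empty bands and normalise to sum to 1
  -sum(p * log(p))         # natural log, matching the plot's y axis
}

even_bands  <- rep(1, 10)        # sound spread evenly over 10 bands
single_band <- c(1, rep(0, 9))   # all sound in one band

shannon(even_bands)   # log(10), about 2.30: the maximum for 10 bands
shannon(single_band)  # 0: the minimum
```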

Daily average acoustic biodiversity (BI) across sites. Higher numbers represent a more diverse soundscape. This differs from the diversity index because it excludes anthropogenic disturbance, making it a better index for understanding the biophony (biological soundscape) of a site.

Daily average anthropogenic disturbance (NDSI) across sites. Lower numbers represent more anthropogenic disturbance in the soundscape: -1 represents 100% anthropogenic sound and 1 represents 100% biological sound.

Boxplots for Indices

Cowplot for Average Indices

One graphic showing most important acoustic indices in one overview.

day_avg_aci<- ggplot(data= day_avg_index, aes(x= DAY, y= avg_ACI, color= STATION, shape= Site_Type))+ 
         geom_line(linewidth=1)+  # linewidth replaces size for lines in ggplot2 >= 3.4.0
         geom_point(size= 3)+
         theme_bw(base_size=18)+
         ggtitle("Daily Average Indices")+
         labs(x= " ", y= "ACI - Complexity")+ 
         scale_x_datetime(date_breaks =  "1 day")+ 
         theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))+
         guides(color = guide_legend(title = "Station"), shape = guide_legend(title = "Site Type"))

day_avg_bi<- ggplot(data= day_avg_index, aes(x= DAY, y= avg_BI, color= STATION, shape= Site_Type))+ 
         geom_line(linewidth=1)+
         geom_point(size= 3)+
         theme_bw(base_size=18)+
         labs(x= " ", y= "BI - Bioacoustics")+ 
         scale_x_datetime(date_breaks =  "1 day")+ 
         theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))+
         guides(color = guide_legend(title = "Station"), shape = guide_legend(title = "Site Type"))

day_avg_ndsi<- ggplot(data= day_avg_index, aes(x= DAY, y= avg_NDSI, color= STATION, shape= Site_Type))+ 
         geom_line(linewidth=1)+
         geom_point(size= 3)+
         theme_bw(base_size=18)+
         labs(x= " ", y= "NDSI - Anthropogenic Disturbance")+ 
         scale_x_datetime(date_breaks =  "1 day")+ 
         theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))+
         guides(color = guide_legend(title = "Station"), shape = guide_legend(title = "Site Type"))+ scale_y_reverse()

plot_grid(day_avg_aci, 
         day_avg_bi,
         day_avg_ndsi,
          nrow= 3, align = "v", axis="l",
          labels="AUTO")

LeeAnn Graphics

Boxplot

my_comparisons <- list(c("GAR_01", "RF_02"),
                       c("GAR_01", "RF_01"),
                       c("GAR_02", "RF_02"),
                       c("GAR_04", "RF_02"),
                       c("GAR_02", "RF_01"),
                       c("GAR_01", "GAR_04"), 
                       c("RF_01", "RF_02"),
                       c("GAR_04", "RF_01"), 
                       c("GAR_04", "GAR_02"),
                       c("GAR_01", "GAR_02")
                       )

box_avg_aci<- ggplot(day_avg_index, aes(x=STATION, y=avg_ACI, color= STATION))+ 
         geom_boxplot()+
         theme_bw(base_size=15)+
         labs(x= " ", y= "ACI - Complexity")+ 
         guides(color="none")+
         stat_compare_means(label="p.format", method="t.test", comparisons = my_comparisons,
                            symnum.args = list(cutpoints = c(0, 0.05, 1), symbols = c("**", "-")))+  
         stat_compare_means(method = "anova", label.y = +Inf, vjust=1,hjust=-0.4)+
         theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))
box_avg_bi<- ggplot(day_avg_index, aes(x=STATION,y= avg_BI, color= STATION))+ 
         geom_boxplot()+
         theme_bw(base_size=15)+
         labs(x= " ", y= "BI - Bioacoustics")+ 
         guides(color="none")+
         stat_compare_means(label="p.format", method="t.test", comparisons = my_comparisons,
                            symnum.args = list(cutpoints = c(0, 0.05, 1), symbols = c("**", "-")))+ 
         stat_compare_means(method = "anova", label.y = +Inf, vjust=1,hjust=-0.5)+       
         theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))

box_avg_ndsi<- ggplot(day_avg_index, aes(x=STATION,y= avg_NDSI, color= STATION))+ 
         geom_boxplot()+
         theme_bw(base_size=15)+
         labs(x= " ", y= "NDSI - Anthropogenic Disturbance")+ 
         guides(color="none")+
         stat_compare_means(label="p.format", method="t.test", comparisons = my_comparisons,
                            symnum.args = list(cutpoints = c(0, 0.05, 1), symbols = c("**", "-")))+ 
         stat_compare_means(method = "anova", label.y = +Inf, vjust=1,hjust=-0.4)+
         theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))

plot_grid(box_avg_aci, 
         box_avg_bi,
         box_avg_ndsi,
          nrow= 1, align = "v", axis="l",
          labels="AUTO")
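
The station comparison list above is written out by hand. As an alternative sketch, `combn()` can generate every pair of stations, which avoids missing a pair when stations are added. The station vector is copied from the hand-written list; note that `combn()` returns the pairs in a different order.

```r
stations <- c("GAR_01", "GAR_02", "GAR_04", "RF_01", "RF_02")

# All unordered pairs, returned as a list of length-2 character vectors,
# the same shape as the hand-written my_comparisons list.
my_comparisons_auto <- combn(stations, 2, simplify = FALSE)

length(my_comparisons_auto)  # 10 pairs for 5 stations
my_comparisons_auto[[1]]     # "GAR_01" "GAR_02"
```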

Relying on LeeAnn’s CSV for now. The code that creates the time-of-day column still needs to be added to this script.
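
One possible way to build that column in R, sketched with `cut()`. The hour boundaries below are placeholders chosen for illustration; the categories in the CSV may be based on actual twilight times, which this does not reproduce.

```r
# Hypothetical recording hours (0-23).
hours <- c(2, 6, 12, 18, 23)

# Bin hours into periods; right = FALSE makes each bin [start, end).
tod <- cut(hours,
           breaks = c(0, 5, 10, 16, 21, 24),
           labels = c("night", "morning", "mid-day", "evening", "night_late"),
           right  = FALSE)

# Fold the late-night bin back into "night" so the day wraps around midnight.
levels(tod)[levels(tod) == "night_late"] <- "night"

as.character(tod)  # "night" "morning" "mid-day" "evening" "night"
```

The level names match the ones used in `my_comparisons1`, so the result could be used as the `Twilight` column once the boundaries are agreed on.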

my_comparisons1 <- list(c("morning", "mid-day"),
                       c("morning", "evening"),
                       c("morning", "night"),
                       c("mid-day", "evening"),
                       c("evening", "night"),
                       c("mid-day", "night"))

box_avg_aci_tod<- ggplot(index_timeofday, aes(x=Twilight, y=avg_ACI, fill=Twilight))+ 
         geom_boxplot()+
         theme_bw(base_size=15)+
         labs(x= " ", y= "ACI - Complexity")+ 
         guides(fill="none")+
         stat_compare_means(method="t.test",comparisons = my_comparisons1)+ 
         scale_fill_manual(values=c("lightsteelblue","skyblue3","dodgerblue4","navyblue"))+
         theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))

box_avg_bi_tod<- ggplot(index_timeofday, aes(x=Twilight,y= avg_BI, fill=Twilight))+ 
         geom_boxplot()+
         theme_bw(base_size=15)+
         labs(x= " ", y= "BI - Bioacoustics")+ 
         guides(fill="none")+
         stat_compare_means(method="t.test",comparisons = my_comparisons1)+ 
         scale_fill_manual(values=c("lightsteelblue","skyblue3","dodgerblue4","navyblue"))+ 
         theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))

box_avg_ndsi_tod<- ggplot(index_timeofday, aes(x=Twilight,y= avg_NDSI, fill=Twilight))+ 
         geom_boxplot()+
         theme_bw(base_size=15)+
         labs(x= " ", y= "NDSI - Anthropogenic Disturbance")+ 
         guides(fill="none")+
         stat_compare_means(method="t.test",comparisons = my_comparisons1)+ 
         scale_fill_manual(values=c("lightsteelblue","skyblue3","dodgerblue4","navyblue"))+ 
         theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))

plot_grid(box_avg_aci_tod, 
         box_avg_bi_tod,
         box_avg_ndsi_tod,
          nrow= 1, align = "v", axis="l",
          labels="AUTO")

ggplot(filter(index_timeofday, Twilight!="night"&STATION%in%c("GAR_01","GAR_02","GAR_04")), aes(x=DATE, y=avg_ACI, fill=Twilight))+
  geom_bar(stat="identity")+
  coord_polar()+
  theme_bw()+
  scale_x_date(labels=date_format("%b-%d"),date_breaks =  "1 day")+
  scale_fill_manual(values=c("lightsteelblue","skyblue3","dodgerblue4","navyblue"))+
  labs(title="Restoration sites") 

ggplot(filter(index_timeofday, Twilight!="night"&STATION%in%c("RF_01","RF_02")), aes(x=DATE, y=avg_ACI, fill=Twilight))+
  geom_bar(stat="identity")+
  coord_polar()+
  theme_bw()+
  scale_x_date(labels=date_format("%b-%d"),date_breaks =  "1 day")+
  scale_fill_manual(values=c("lightsteelblue","skyblue3","dodgerblue4","navyblue"))+
  labs(title="Reference sites")

ggplot(index_timeofday, aes(x=DATE, y=avg_ACI, color=STATION))+
  facet_grid(STATION~Twilight)+
  geom_point(aes(size=avg_ACI))+
  theme_bw()+
  theme(legend.position="bottom")+
  labs(y="Mean ACI - complexity", x="Julian day", color="")+
  scale_x_date(date_labels = "%j")+
  scale_size_continuous(range = c(1, 3.5))+
  guides(size="none", color="none")

a<-ggplot(index_data,aes(x=DATE,y=hour(TIME),fill=ACI))+
  facet_wrap(~STATION, ncol=1)+
  geom_raster()+
  theme_classic()+
  theme(legend.position="bottom")+
  scale_fill_viridis(option ="turbo",limits = c(92, 1565))+
  scale_x_date(date_labels = "%j")+
  labs(x="Julian day", y="Hour of the day", fill="ACI", title="Complexity")

b<-ggplot(index_data,aes(x=DATE,y=hour(TIME),fill=BI))+
  facet_wrap(~STATION, ncol=1)+
  geom_raster()+
  theme_classic()+
  theme(legend.position="bottom")+
  scale_fill_viridis(option ="rocket",limits = c(14,251))+
  scale_x_date(date_labels = "%j")+
  labs(x="Julian day", y="Hour of the day", fill="BI", title="Bioacoustics")

c<-ggplot(index_data,aes(x=DATE,y=hour(TIME),fill=NDSI))+
  facet_wrap(~STATION, ncol=1)+
  geom_raster()+
  theme_classic()+
  theme(legend.position="bottom")+
  scale_fill_viridis(option ="mako",limits = c(-1,1))+
  scale_x_date(date_labels = "%j")+
  labs(x="Julian day", y="Hour of the day", fill="NDSI", title="Anthropogenic Disturbance")

plot_grid(a, 
         b,
         c,
          ncol= 3,
          labels="AUTO")

BirdNet Species Summaries

These data come from Cornell’s BirdNET Analyzer, which is essentially a GUI version of their popular Merlin app. It requires many decisions about how to analyze your recordings: the user selects confidence levels, sensitivity, and overlap. The data crunched below are from the first deployment (n=10) and use very lax confidence levels. They will require manual verification (especially for rare species), and the accuracy should be tested. However, we are currently at the stage of seeing how best to visualize the data. In the near future, we will test different methods to get the most accurate representation of species presence and community composition.

The summary below counts the number of unique species identified by BirdNET at each site. These data will be verified to eliminate false detections. The software does a good job of identifying common birds but will occasionally include a few ecologically unreasonable detections. These generally account for very few detections (i.e., <0.01% of all detections), but each one still adds to the species count, so the numbers in the summaries appear inflated.

n_distinct(species_data_birdnet$Scientific_Name)
## [1] 288
setDT(species_data_birdnet)[, .(count = uniqueN(Scientific_Name)), by = Station]
##    Station count
##     <char> <int>
## 1:  GAR_01   227
## 2:  GAR_02   199
## 3:   RF_01   251
## 4:   RF_02   222
setDT(species_data_birdnet)[, .(count = uniqueN(Scientific_Name)), by = Site]
##            Site count
##          <char> <int>
## 1:     Garrison   254
## 2: Russell Farm   267
guild_count<- setDT(species_data_birdnet)[, .(count = uniqueN(Scientific_Name)), by = Guild]
species_breakdown <- species_data_birdnet %>%
    group_by(Station, Scientific_Name) %>%
    summarize(Count = n()) %>%
    mutate(Percent = round(Count / sum(Count) * 100, 2))
species_over_1p<- species_breakdown %>% 
  mutate(Percent_1p = case_when(
    Percent > 0.01~ "Above",
    Percent  < 0.01 ~ "Below",
    Percent == 0.01 ~ "At")
    %>% as.factor())

species_data_birdnet2 <- species_data_birdnet %>% 
  mutate(Site_Type = case_when(
    Station == "RF_01" ~ "Reference",
    Station == "RF_02" ~ "Reference",
    Station == "GAR_01" ~ "Wild Card",
    Station == "GAR_02" ~ "Restoration",
    Station == "GAR_03" ~ "Wild Card",
    Station == "GAR_04" ~ "Restoration")
    %>% as.factor())

RF_01_spps_breakdown <- species_data_birdnet %>%
    filter(Station == "RF_01") %>%
    group_by(Scientific_Name) %>%
    summarize(Count = n()) %>%
    mutate(Percent = round(Count / sum(Count) * 100, 2))
RF_02_spps_breakdown <- species_data_birdnet %>%
    filter(Station == "RF_02") %>%
    group_by(Scientific_Name) %>%
    summarize(Count = n()) %>%
    mutate(Percent = round(Count / sum(Count) * 100, 2))
GAR_01_spps_breakdown <- species_data_birdnet %>%
    filter(Station == "GAR_01") %>%
    group_by(Scientific_Name) %>%
    summarize(Count = n()) %>%
    mutate(Percent = round(Count / sum(Count) * 100, 2))
GAR_02_spps_breakdown <- species_data_birdnet %>%
    filter(Station == "GAR_02") %>%
    group_by(Scientific_Name) %>%
    summarize(Count = n()) %>%
    mutate(Percent = round(Count / sum(Count) * 100, 2))
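
The four per-station blocks above repeat the same steps. A compact alternative, sketched in base R on invented detections (the `Station` and `Scientific_Name` columns follow the code above; the rows are made up), computes every station's breakdown in one pass:

```r
# Invented detections for illustration only.
detections <- data.frame(
  Station         = c("RF_01", "RF_01", "RF_01", "RF_02"),
  Scientific_Name = c("Ammospiza maritima", "Ammospiza maritima",
                      "Agelaius phoeniceus", "Ammospiza maritima")
)

# One breakdown table per station, stored in a named list.
spps_breakdown <- lapply(split(detections, detections$Station), function(d) {
  counts <- as.data.frame(table(Scientific_Name = d$Scientific_Name),
                          responseName = "Count", stringsAsFactors = FALSE)
  counts$Percent <- round(counts$Count / sum(counts$Count) * 100, 2)
  counts
})

spps_breakdown$RF_01
```

With this layout, `spps_breakdown[["GAR_01"]]` would stand in for a separate `GAR_01_spps_breakdown` object, and a new station needs no new code.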

BirdNet Species Outputs

These data come from Cornell’s BirdNET Analyzer, which is essentially a GUI version of their popular Merlin app. It requires many decisions about how to analyze your recordings: the user selects confidence levels, sensitivity, and overlap. The data crunched below are from the first deployment (n=10) and use very lax confidence levels. They will require manual verification (especially for rare species), and the accuracy should be tested. However, we are currently at the stage of seeing how best to visualize the data. In the near future, we will test different methods to get the most accurate representation of species presence and community composition.

Data are visualized in two ways: by common species and by guild. Guilds give a good overview of how sites vary (e.g., more shorebirds and ducks in a mudflat). The graphics below all reference the same dataset, but call out either specific common bird species or guilds.

Species Diversity

In the future, we want to annotate the deployment times this represents, maybe just in the title. Currently the species graphics are only for the first deployment of 2023. Running BirdNET requires a lot of time and processing power, so we have only completed one round of processing. This will need to be updated if it is ever worth the effort.

Guild breakdown for all sites during the first deployment (4/26/23 to 5/7/23).

Breakdown of common bird species by site during the first deployment (4/26/23 to 5/7/23). Four-letter alpha codes represent common name of bird species. Seaside sparrow (SESP) is by far the most dominant species with 380,311 total detections.

Guild breakdown by site during the first deployment (4/26/23 to 5/7/23). Passerines were the most common guild represented in the data due to the high level of SESP detections.


Cowplot for Species Diversity

Grouping the above graphics by their site types (restoration, reference, wildcard) just to play around with grouping. Will probably group by habitat type in the future. I think that gives more context to what is going on with the species diversity. However, the site type graphics will be important to have during the post-restoration data collection.

# Common Birds

Restore_common <- ggplot(species_data_birdnet2 %>% filter(Site_Type == "Restoration"), aes(x= Common_Birds,  group= Station)) + 
    geom_bar(aes(y = after_stat(prop), fill = factor(after_stat(x))), stat="count") +
    geom_text(aes( label = scales::percent(after_stat(prop),accuracy=0.01),
                   y= after_stat(prop) ), stat= "count", vjust =- 1.5, size= 3) +
    labs(x= " ", y = "Percent", fill="Common_Birds") +
  guides(fill= "none")+
         theme_bw(base_size=15)+
    facet_grid(~Station) +
  labs(subtitle = "Restoration Sites") +
    scale_y_continuous(labels = scales::percent)+ 
         theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))

Refer_common <- ggplot(species_data_birdnet2 %>% filter(Site_Type == "Reference"), aes(x= Common_Birds,  group= Station)) + 
    geom_bar(aes(y = after_stat(prop), fill = factor(after_stat(x))), stat="count") +
    geom_text(aes( label = scales::percent(after_stat(prop),accuracy=0.1),
                   y= after_stat(prop) ), stat= "count", vjust =- 1.5, size= 2.5) +
    labs(x= " ", y = "Percent", fill="Common_Birds") +
  guides(fill= "none")+
         theme_bw(base_size=18)+
    facet_grid(~Station) +
  labs(subtitle = "Reference Sites") +
    scale_y_continuous(labels = scales::percent)+ 
         theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))

Wild_common <- ggplot(species_data_birdnet2 %>% filter(Site_Type == "Wild Card"), aes(x= Common_Birds,  group= Station)) + 
    geom_bar(aes(y = after_stat(prop), fill = factor(after_stat(x))), stat="count") +
    geom_text(aes( label = scales::percent(after_stat(prop),accuracy=0.1),
                   y= after_stat(prop) ), stat= "count", vjust =- 1.5, size= 2.5) +
    labs(x= "Common Bird Species Alpha Code", y = "Percent", fill="Common_Birds") +
  guides(fill= "none")+
         theme_bw(base_size=18)+
    facet_grid(~Station) +
  labs(subtitle = "Wild Card Sites") +
    scale_y_continuous(labels = scales::percent)+ 
         theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))

# By Guild

Restore_guild <- ggplot(species_data_birdnet2 %>% filter(Site_Type == "Restoration"), aes(x= Lower_Guild,  group= Station)) + 
    geom_bar(aes(y = after_stat(prop), fill = factor(after_stat(x))), stat="count") +
    geom_text(aes( label = scales::percent(after_stat(prop),accuracy=0.01),
                   y= after_stat(prop) ), stat= "count", vjust =- 1.5, size= 3) +
    labs(x= " ", y = "Percent", fill="Lower_Guild") +
  guides(fill= "none")+
         theme_bw(base_size=15)+
    facet_grid(~Station) +
  labs(subtitle = "Restoration Sites") +
    scale_y_continuous(labels = scales::percent)+ 
         theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))

Refer_guild <- ggplot(species_data_birdnet2 %>% filter(Site_Type == "Reference"), aes(x= Lower_Guild,  group= Station)) + 
    geom_bar(aes(y = after_stat(prop), fill = factor(after_stat(x))), stat="count") +
    geom_text(aes( label = scales::percent(after_stat(prop),accuracy=0.1),
                   y= after_stat(prop) ), stat= "count", vjust =- 1.5, size= 2.5) +
    labs(x= " ", y = "Percent", fill="Lower_Guild") +
  guides(fill= "none")+
         theme_bw(base_size=18)+
    facet_grid(~Station) +
  labs(subtitle = "Reference Sites") +
    scale_y_continuous(labels = scales::percent)+ 
         theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))

Wild_guild <- ggplot(species_data_birdnet2 %>% filter(Site_Type == "Wild Card"), aes(x = Lower_Guild, group = Station)) +
  # after_stat() replaces the deprecated ..prop.. notation (ggplot2 >= 3.3.0)
  geom_bar(aes(y = after_stat(prop), fill = factor(after_stat(x))), stat = "count") +
  geom_text(aes(label = scales::percent(after_stat(prop), accuracy = 0.1),
                y = after_stat(prop)), stat = "count", vjust = -1.5, size = 2.5) +
  labs(x = "Guild", y = "Percent", subtitle = "Wild Card Sites") +
  guides(fill = "none") +  # fill only colors the bars, so the legend is hidden
  theme_bw(base_size = 18) +
  facet_grid(~Station) +
  scale_y_continuous(labels = scales::percent) +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))
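
The percentages drawn by the bar charts above are within-station shares: because the plots set group = Station, the proportion is computed separately for each station, so the bars in one facet sum to 100%. A minimal base R sketch of that same calculation, on a made-up toy data frame (the object names `toy`, `counts`, and `station_prop` and all values are hypothetical, not taken from the project data):

```r
# Hypothetical toy detections; values are invented for illustration only.
toy <- data.frame(
  Station      = c("GAR_01", "GAR_01", "GAR_01", "GAR_01", "RF_01", "RF_01"),
  Common_Birds = c("SESP", "SESP", "SESP", "CLRA", "OSPR", "SESP")
)

# Count detections per species within each station (stations are rows).
counts <- table(toy$Station, toy$Common_Birds)

# Divide each row by its station total, so every station sums to 1.
station_prop <- prop.table(counts, margin = 1)

print(round(station_prop, 2))
```

Dropping the grouping would instead give each bar a proportion of 1, which is why the group aesthetic matters in these plots.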

title1  = "Common Bird Species By Site Type"
title2  = "Bird Guild By Site Type"

plot_grid(Restore_common, 
         Refer_common,
         Wild_common,
         labels = title1, label_size = 14, hjust = -2,
          nrow= 3, align = "v", axis="l")
Common bird species breakdown by site type (Restoration, Reference, Wild Card) during the first deployment of 2023 (4/26/23 to 5/7/23).

plot_grid(Restore_guild, 
         Refer_guild,
         Wild_guild,
         labels = title2, label_size = 14, hjust = -3,
          nrow= 3, align = "v", axis="l")
Bird guild breakdown by site type (Restoration, Reference, Wild Card) during the first deployment of 2023 (4/26/23 to 5/7/23).

Summary Tables

These tables give further context to the data in tabular rather than graphical form. They are broken down by site, station, acoustic index, site type, and confidence level.

st(index_data %>% dplyr::select(c("SITE", "STATION", "NDSI", "ACI", "ADI" , "AEI", "BI")))
Summary Statistics
Variable          N        Mean    Std. Dev.  Min     Pctl. 25  Pctl. 75  Max
SITE              100742
… Garrison        55994    56%
… Russell Farm    44748    44%
STATION           100742
… GAR_01          5636     6%
… GAR_02          23782    24%
… GAR_03          5712     6%
… GAR_04          20864    21%
… RF_01           20228    20%
… RF_02           24520    24%
NDSI              98700    0.64    0.39       -0.95   0.38      0.95      1
ACI               98700    720     101        77      657       742       1565
ADI               98700    1.5     0.58       0       1.1       1.9       2.3
AEI               98700    0.58    0.21       0       0.44      0.75      0.9
BI                98700    106     42         16      69        139       250
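
In the table above, the index rows report N = 98700 while SITE and STATION report 100742. This is consistent with N counting only non-missing values for numeric variables, which appears to be the default behavior of st(), though the source does not confirm it. A minimal base R sketch of how one such row and the category shares could be reproduced, using a hypothetical toy data frame (`toy_index` and `summarise_numeric` are illustrative names, and the quantile method may differ from the one st() uses):

```r
# Hypothetical index values with one missing reading (illustration only).
toy_index <- data.frame(
  STATION = c("GAR_01", "GAR_01", "RF_01", "RF_01", "RF_01"),
  NDSI    = c(0.2, 0.6, NA, 0.8, 1.0)
)

# One summary row for a numeric column; N counts non-missing values only.
summarise_numeric <- function(x) {
  c(
    N       = sum(!is.na(x)),
    Mean    = mean(x, na.rm = TRUE),
    Std.Dev = sd(x, na.rm = TRUE),
    Min     = min(x, na.rm = TRUE),
    Pctl.25 = unname(quantile(x, 0.25, na.rm = TRUE)),
    Pctl.75 = unname(quantile(x, 0.75, na.rm = TRUE)),
    Max     = max(x, na.rm = TRUE)
  )
}

ndsi_summary <- summarise_numeric(toy_index$NDSI)

# Categorical columns are reported as a count and a share of all rows.
station_counts <- table(toy_index$STATION)
station_share  <- prop.table(station_counts)

print(round(ndsi_summary, 2))
print(round(station_share, 2))
```

Here the numeric N is 4 even though the data frame has 5 rows, mirroring the gap between the index rows and the SITE total.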
st(avg_index2 %>% dplyr::select(c("SITE", "STATION", "Site_Type", "avg_NDSI", "avg_ACI", "avg_ADI" , "avg_AEI", "avg_BI")))#,
                                #"sd_NDSI","sd_ACI", "sd_ADI" , "sd_AEI", "sd_BI")))
## Adding missing grouping variables: `DATETIME`, `DATE`
Summary Statistics
Variable          N        Mean    Std. Dev.  Min     Pctl. 25  Pctl. 75  Max
SITE              49352
… Garrison        26978    55%
… Russell Farm    22374    45%
STATION           49352
… GAR_01          2818     6%
… GAR_02          11891    24%
… GAR_03          2819     6%
… GAR_04          9450     19%
… RF_01           10114    20%
… RF_02           12260    25%
Site_Type         49352
… Wild Card       5637     11%
… Restoration     21341    43%
… Reference       22374    45%
avg_NDSI          49350    0.64    0.37       -0.94   0.38      0.94      1
avg_ACI           49350    720     92         79      660       747       1332
avg_ADI           49350    1.5     0.54       0.001   1.1       1.9       2.3
avg_AEI           49350    0.58    0.2        0       0.45      0.73      0.9
avg_BI            49350    106     41         20      71        138       235
st(species_data_birdnet2 %>% dplyr::select(c("Site", "Station", "Confidence", "Site_Type")))
Summary Statistics
Variable          N        Mean    Std. Dev.  Min     Pctl. 25  Pctl. 75  Max
Site              886670
… Garrison        454358   51%
… Russell Farm    432312   49%
Station           886670
… GAR_01          220267   25%
… GAR_02          234091   26%
… RF_01           175198   20%
… RF_02           257114   29%
Confidence        883874   0.65    0.29       0.17    0.37      0.94      1
Site_Type         886670
… Reference       432312   49%
… Restoration     234091   26%
… Wild Card       220267   25%
st(species_data_birdnet %>% dplyr::select(c("Site", "Station", "Confidence", "Common_Birds")), group= 'Common_Birds', group.long = TRUE)
Summary Statistics
Variable          N        Mean    Std. Dev.  Min     Pctl. 25  Pctl. 75  Max
Common_Birds: CAGO
Site              4898
… Garrison        3862     79%
… Russell Farm    1036     21%
Station           4898
… GAR_01          2019     41%
… GAR_02          1843     38%
… RF_01           624      13%
… RF_02           412      8%
Confidence        4898     0.46    0.24       0.17    0.25      0.63      1
Common_Birds: CLRA
Site              131722
… Garrison        72327    55%
… Russell Farm    59395    45%
Station           131722
… GAR_01          16318    12%
… GAR_02          56009    43%
… RF_01           16779    13%
… RF_02           42616    32%
Confidence        131722   0.59    0.26       0.17    0.35      0.84      1
Common_Birds: CWWI
Site              12576
… Garrison        825      7%
… Russell Farm    11751    93%
Station           12576
… GAR_01          823      7%
… GAR_02          2        0%
… RF_01           4182     33%
… RF_02           7569     60%
Confidence        12425    0.83    0.23       0.17    0.74      0.99      1
Common_Birds: EWPW
Site              11127
… Garrison        11113    100%
… Russell Farm    14       0%
Station           11127
… GAR_01          10851    98%
… GAR_02          262      2%
… RF_01           10       0%
… RF_02           4        0%
Confidence        11127    0.82    0.24       0.17    0.73      0.99      1
Common_Birds: GRYE
Site              19381
… Garrison        9941     51%
… Russell Farm    9440     49%
Station           19381
… GAR_01          1852     10%
… GAR_02          8089     42%
… RF_01           3080     16%
… RF_02           6360     33%
Confidence        19381    0.58    0.28       0.17    0.32      0.86      1
Common_Birds: GWTE
Site              25044
… Garrison        15625    62%
… Russell Farm    9419     38%
Station           25044
… GAR_01          852      3%
… GAR_02          14773    59%
… RF_01           7544     30%
… RF_02           1875     7%
Confidence        25044    0.67    0.29       0.17    0.4       0.95      1
Common_Birds: KIRA
Site              14953
… Garrison        5023     34%
… Russell Farm    9930     66%
Station           14953
… GAR_01          1927     13%
… GAR_02          3096     21%
… RF_01           6217     42%
… RF_02           3713     25%
Confidence        14953    0.41    0.23       0.17    0.23      0.55      1
Common_Birds: LAGU
Site              24834
… Garrison        13453    54%
… Russell Farm    11381    46%
Station           24834
… GAR_01          4221     17%
… GAR_02          9232     37%
… RF_01           6795     27%
… RF_02           4586     18%
Confidence        24834    0.58    0.27       0.17    0.32      0.84      1
Common_Birds: LEYE
Site              9951
… Garrison        4161     42%
… Russell Farm    5790     58%
Station           9951
… GAR_01          503      5%
… GAR_02          3658     37%
… RF_01           2297     23%
… RF_02           3493     35%
Confidence        9951     0.51    0.26       0.17    0.27      0.74      1
Common_Birds: MAWR
Site              68782
… Garrison        66257    96%
… Russell Farm    2525     4%
Station           68782
… GAR_01          10603    15%
… GAR_02          55654    81%
… RF_01           2505     4%
… RF_02           20       0%
Confidence        68782    0.66    0.27       0.17    0.41      0.92      1
Common_Birds: OSPR
Site              41246
… Garrison        1185     3%
… Russell Farm    40061    97%
Station           41246
… GAR_01          857      2%
… GAR_02          328      1%
… RF_01           23109    56%
… RF_02           16952    41%
Confidence        41246    0.72    0.28       0.17    0.48      0.98      1
Common_Birds: Other
Site              97581
… Garrison        47441    49%
… Russell Farm    50140    51%
Station           97581
… GAR_01          19279    20%
… GAR_02          28162    29%
… RF_01           30254    31%
… RF_02           19886    20%
Confidence        95266    0.49    0.28       0.17    0.25      0.73      1
Common_Birds: RWBL
Site              8748
… Garrison        1172     13%
… Russell Farm    7576     87%
Station           8748
… GAR_01          945      11%
… GAR_02          227      3%
… RF_01           6298     72%
… RF_02           1278     15%
Confidence        8748     0.36    0.18       0.17    0.22      0.45      0.99
Common_Birds: SESP
Site              380311
… Garrison        189103   50%
… Russell Farm    191208   50%
Station           380311
… GAR_01          139249   37%
… GAR_02          49854    13%
… RF_01           51097    13%
… RF_02           140111   37%
Confidence        380311   0.73    0.28       0.17    0.49      0.98      1
Common_Birds: TRSW
Site              13255
… Garrison        7466     56%
… Russell Farm    5789     44%
Station           13255
… GAR_01          7455     56%
… GAR_02          11       0%
… RF_01           5485     41%
… RF_02           304      2%
Confidence        13255    0.65    0.28       0.17    0.39      0.93      1
Common_Birds: WEVI
Site              5597
… Garrison        10       0%
… Russell Farm    5587     100%
Station           5597
… GAR_01          10       0%
… GAR_02          0        0%
… RF_01           5354     96%
… RF_02           233      4%
Confidence        5597     0.52    0.25       0.17    0.29      0.73      1
Common_Birds: WILL
Site              16264
… Garrison        5198     32%
… Russell Farm    11066    68%
Station           16264
… GAR_01          2443     15%
… GAR_02          2755     17%
… RF_01           3488     21%
… RF_02           7578     47%
Confidence        16264    0.63    0.28       0.17    0.36      0.92      1
st(species_breakdown)
Summary Statistics
Variable          N        Mean    Std. Dev.  Min     Pctl. 25  Pctl. 75  Max
Station           899
… GAR_01          227      25%
… GAR_02          199      22%
… RF_01           251      28%
… RF_02           222      25%
Count             899      986     7760       1       4         76        140111
Percent           899      0.44    3.4        0       0         0.04      63